Similar resources
Discovering data quality rules
Dirty data is a serious problem for businesses, leading to incorrect decision making, inefficient daily operations, and ultimately wasted time and money. Dirty data often arises when domain constraints and business rules, meant to preserve data consistency and accuracy, are enforced incompletely or not at all in application code. In this work, we propose a new data-driven tool that can be ...
Discovering regression data quality through clustering methods
We propose the use of clustering methods in order to discover the quality of each element in a training set to be subsequently fed to a regression algorithm. The paper shows that these methods, used in combination with regression algorithms taking into account the additional information conveyed by this kind of quality, allow the attainment of higher performances than those obtained through sta...
Discovering Data Quality Rules in a Master Data Management
Dirty data continues to be an important issue for companies. The Data Warehousing Institute [Eckerson, 2002], [Rockwell, 2012] stated that poor data costs US businesses $611 billion annually and that erroneously priced data in retail databases costs US customers $2.5 billion each year. Data quality is becoming more and more critical. The database community pays particular attention to this subject whe...
A Data Driven Approach for Discovering Data Quality Requirements
Existing methodologies for identifying data quality issues are inevitably user-centric, wherein data quality requirements are determined in a top-down manner following organizational structures and data governance frameworks. In the current data landscape, however, users are often confronted with new, unexplored data sets that may have relevance and potential to create value. In such scenarios ...
Discovering Pattern Tableaux for Data Quality Analysis: a Case Study
In this paper, we present a case study that illustrates the utility of pattern tableau discovery for data quality analysis. Given a user-supplied integrity constraint, such as a boolean predicate expected to be satisfied by every tuple, a functional dependency, or an inclusion dependency, a pattern tableau is a concise summary of subsets of the data that satisfy or fail the constraint. We descri...
Journal
Journal title: Business & Information Systems Engineering
Year: 2019
ISSN: 2363-7005, 1867-0202
DOI: 10.1007/s12599-019-00608-0